Failure Recovery based on Quasi-Synchronous Checkpointing in Mobile Computing Systems
Abstract
Mobile computing systems are expected to revolutionize the way computers are used. Mobile hosts have small memory, a relatively slow processor, and low-power batteries, and they communicate over low-bandwidth wireless links. In this paper, we address the problem of failure recovery in mobile computing systems. Any recovery method for such systems should take into account the energy and communication bandwidth constraints under which mobile computers have to operate. Synchronous checkpointing is not suitable for mobile systems because it incurs a high communication cost over a low-bandwidth network. Asynchronous checkpointing is not suitable because multiple checkpoints must be kept in stable storage, and some or all of the checkpoints taken may be useless for constructing consistent global checkpoints. We propose a low-overhead recovery algorithm for mobile computing systems based on a quasi-synchronous checkpointing algorithm. The checkpointing algorithm preserves process autonomy by allowing processes to take checkpoints asynchronously, and it uses communication-induced checkpoint coordination to advance the recovery line, which helps bound rollback propagation during recovery. Thus, it combines the simplicity and low overhead of asynchronous checkpointing with the recovery-time advantages of synchronous checkpointing. The checkpointing algorithm ensures that, at all times, a recovery line exists that is consistent with any checkpoint of any process. The recovery algorithm exploits this property to restore the system, in the event of a failure, to a state consistent with the latest checkpoint of the failed process. It uses selective pessimistic message logging at the receiver end to handle messages lost due to rollback.
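The abstract does not spell out how the communication-induced coordination works. As a rough illustration only, the sketch below assumes one common mechanism: each process keeps a checkpoint sequence number, piggybacks it on every message, and takes a forced checkpoint before processing a message whose number is ahead of its own. The names `Process`, `Message`, `sn`, and `recovery_line` are hypothetical and chosen for this example, and message logging is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: int
    payload: str
    sn: int  # sender's checkpoint sequence number (assumed piggyback field)


@dataclass
class Process:
    pid: int
    sn: int = 0                                      # number of the latest checkpoint
    state: list = field(default_factory=list)        # toy application state
    checkpoints: dict = field(default_factory=dict)  # sn -> saved copy of state

    def __post_init__(self):
        self._save()  # initial checkpoint, sequence number 0

    def _save(self):
        self.checkpoints[self.sn] = list(self.state)

    def basic_checkpoint(self):
        """Checkpoint taken autonomously, whenever the process chooses."""
        self.sn += 1
        self._save()

    def send(self, payload):
        self.state.append(("sent", payload))
        return Message(self.pid, payload, self.sn)

    def receive(self, msg):
        # Forced checkpoint: if the sender is ahead, checkpoint with the
        # sender's number BEFORE processing the message.
        if msg.sn > self.sn:
            self.sn = msg.sn
            self._save()
        self.state.append(("received", msg.payload))


def recovery_line(processes, n):
    """For each process, pick its earliest checkpoint numbered >= n.

    None means the process has not reached n yet, so in this sketch it
    has nothing to roll back.
    """
    line = {}
    for p in processes:
        candidates = [s for s in p.checkpoints if s >= n]
        line[p.pid] = min(candidates) if candidates else None
    return line
```

Under this assumption, a message sent after a checkpoint numbered n always carries a number of at least n, so its receiver already holds a checkpoint numbered at least n before the message is processed. The checkpoints chosen by `recovery_line` therefore never record a message as received without it also being recorded as sent, which is what bounds rollback propagation.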
Similar resources
Efficient Checkpoint-based Failure Recovery Techniques in Mobile Computing Systems
Conventional distributed and domino effect-free failure recovery techniques are inappropriate for mobile computing systems because each mobile host is forced to take a new checkpoint (based on coordinated checkpointing). Otherwise, multiple local checkpoints may need to be stored in stable storage (based on communication-induced checkpointing). Hence, this investigation presents a novel domino ...
An Enhanced MSS-based checkpointing Scheme for Mobile Computing Environment
Mobile computing systems are made up of different components among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environment. In the scheme suggested nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs and as a result the workload of Mobile ...
A New Checkpointing Approach for Mobile Distributed System
In this paper, we introduce a weighted checkpointing approach for the mobile distributed computing system (MDCS) that significantly reduces checkpointing overheads on mobile nodes. Checkpoint protocols proposed so far in the literature for MDCS are either coordinated, log based or quasi-synchronous. Coordinated checkpointing requires extra synchronization messages and may block the underlying c...
An Efficient Checkpointing Scheme for Mobile Computing Systems
In this paper, an efficient checkpointing scheme is proposed for mobile computing systems. It uses communication-induced checkpointing with the exception that basic checkpoints are not taken. A new concept of forced checkpoints, different from the existing one, is incorporated to ensure a reduced reexecution after recovery. The proposed algorithm runs periodically to find the set of the globall...
Comprehensive Low-overhead Process Recovery Based on Quasi-synchronous Checkpointing
In this paper, we propose a low-overhead recovery algorithm based on a quasi-synchronous checkpointing algorithm. The checkpointing algorithm preserves process autonomy by allowing them to take checkpoints asynchronously and uses communication-induced checkpoint coordination for the progression of the recovery line which helps bound rollback propagation during a recovery. Thus, it has the easen...
Publication date: 1996